A Pessimistic Approximation for
the Fisher Information Measure

Manuel Stein and Josef A. Nossek 


Abstract: The problem of how to determine the intrinsic quality of a signal processing system with respect to the inference of an unknown deterministic parameter $\theta$ is considered. While Fisher's information measure $F(\theta)$ forms a classical analytical tool for such a problem, its direct computation can become difficult in certain situations. This is in particular an obstacle for the estimation-theoretic performance analysis of non-linear measurement systems, where the form of the conditional output probability function can make the calculation of $F(\theta)$ difficult. Based on the Cauchy-Schwarz inequality, we establish an alternative information measure $S(\theta)$. It forms a pessimistic approximation to the Fisher information $F(\theta)$ and can be evaluated with only the first four output moments at hand. These quantities usually exhibit good mathematical tractability or can be determined at low complexity by output measurements in a calibrated setup or via numerical simulations. With various examples we show that $S(\theta)$ provides a good conservative approximation for $F(\theta)$ and outline different estimation-theoretic problems where the presented information bound turns out to be useful.

Index Terms: estimation theory, non-linear systems, Cramér-Rao bound, experimental design, minimum Fisher information, worst-case noise, squaring loss, hard-limiter, soft-limiter.


I. Introduction 

Suppose we are given a parametric system, characterized by a probability density or mass function $q(y;\theta)$, and face the problem of inferring the deterministic but unknown system parameter $\theta \in \Theta$ from measurements at the system output $Y$. The output $Y$ takes random values $y \in \mathcal{Y}$, where $\mathcal{Y}$ denotes the support of the random variable $Y$. Estimation theory provides a variety of tools for this kind of problem: on the one hand, guidelines for the design of high-performance processing algorithms, and on the other hand, corresponding performance bounds. While the latter were originally derived in order to benchmark different estimation algorithms, establish efficiency or identify potential for further improvement, these error bounds have become popular as a figure of merit for the design and optimization of the measurement system $q(y;\theta)$. Such a problem arises frequently in the field of signal processing, where the engineer is interested not only in the efficient extraction of information from noisy data, but also in the design of the physical measurement system $q(y;\theta)$ itself. Note that the layout of the measurement system can significantly influence technical properties like computational complexity, power consumption, production cost, reliability, processing delay and system performance. Therefore, given the ability to modify the data gathering system $q(y;\theta)$ to an alternative design $p(z;\theta)$ with the altered output $Z$, exhibiting realizations $z \in \mathcal{Z}$, a rigorous method is required in order to draw a precise conclusion about the intrinsic quality of the original system $q(y;\theta)$ and the envisioned modification $p(z;\theta)$ with respect to the problem of deriving a high-performance estimation procedure $\hat{\theta}(\boldsymbol{y})$ or $\hat{\theta}(\boldsymbol{z})$. Here $\boldsymbol{y} \in \mathcal{Y}^N$ and $\boldsymbol{z} \in \mathcal{Z}^N$ denote a collection of $N$ independent realizations of the system outputs $Y$ or $Z$.


A. Estimation and Information Measures 

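We restrict the discussion to unbiased estimation algorithms

$$\int_{\mathcal{Y}^N} \hat{\theta}(\boldsymbol{y})\, q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} = \theta \tag{1}$$

and assume that the system $q(\boldsymbol{y};\theta)$ is differentiable in $\theta \in \Theta$ for every $\boldsymbol{y} \in \mathcal{Y}^N$, where the parameter set $\Theta$ is an open subset of the real line $\mathbb{R}$. Further, all considered systems exhibit regularity, such that the statement

$$\frac{\partial}{\partial\theta} \int_{\mathcal{Y}^N} f(\boldsymbol{y})\, q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} = \int_{\mathcal{Y}^N} f(\boldsymbol{y})\, \frac{\partial q(\boldsymbol{y};\theta)}{\partial\theta}\, \mathrm{d}\boldsymbol{y} \tag{2}$$

holds for any function $f(\cdot)$ which does not have $\theta$ as an argument. Using (1) and (2), we can state that

$$\int_{\mathcal{Y}^N} \hat{\theta}(\boldsymbol{y})\, \frac{\partial q(\boldsymbol{y};\theta)}{\partial\theta}\, \mathrm{d}\boldsymbol{y} = 1. \tag{3}$$

With the requirement

$$\int_{\mathcal{Y}^N} q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} = 1, \quad \forall \theta \in \Theta, \tag{4}$$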


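it follows that

$$\frac{\partial}{\partial\theta} \int_{\mathcal{Y}^N} q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} = 0, \quad \forall \theta \in \Theta, \tag{5}$$

such that we expand (3) to

$$\int_{\mathcal{Y}^N} \big(\hat{\theta}(\boldsymbol{y}) - \theta\big)\, \frac{\partial q(\boldsymbol{y};\theta)}{\partial\theta}\, \mathrm{d}\boldsymbol{y} = 1. \tag{6}$$

Using the fact that

$$\frac{\partial \ln q(\boldsymbol{y};\theta)}{\partial\theta} = \frac{1}{q(\boldsymbol{y};\theta)}\, \frac{\partial q(\boldsymbol{y};\theta)}{\partial\theta}, \tag{7}$$

equation (6) is manipulated, resulting in

$$\int_{\mathcal{Y}^N} \big(\hat{\theta}(\boldsymbol{y}) - \theta\big)\, \frac{\partial \ln q(\boldsymbol{y};\theta)}{\partial\theta}\, q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} = 1. \tag{8}$$

For two real-valued functions $f(\cdot)$ and $g(\cdot)$, the Cauchy-Schwarz inequality states

$$\int f^2(x)\, p(x)\, \mathrm{d}x \int g^2(x)\, p(x)\, \mathrm{d}x \geq \left( \int f(x)\, g(x)\, p(x)\, \mathrm{d}x \right)^2, \tag{9}$$

where equality holds only if

$$f(x) = \kappa\, g(x) + \lambda, \quad \forall x \in \mathcal{X}^N \tag{10}$$

with constants $\kappa, \lambda \in \mathbb{R}$. This allows us to derive the inequality

$$\int_{\mathcal{Y}^N} \big(\hat{\theta}(\boldsymbol{y}) - \theta\big)^2 q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} \geq \left( \int_{\mathcal{Y}^N} \left( \frac{\partial \ln q(\boldsymbol{y};\theta)}{\partial\theta} \right)^2 q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} \right)^{-1} \tag{11}$$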

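from expression (8). As long as the observations are independent and identically distributed, i.e., as long as it is possible to factorize

$$q(\boldsymbol{y};\theta) = \prod_{n=1}^{N} q(y_n;\theta), \quad \forall \boldsymbol{y} \in \mathcal{Y}^N, \tag{12}$$

where $y_n$ denotes the $n$-th entry in the collection of samples $\boldsymbol{y}$, and each element $Y_n$ follows the identical statistical model

$$q(y_n;\theta) = q(y;\theta), \quad \forall n \in \{1, 2, \ldots, N\}, \tag{13}$$

the right-hand side of (11) simplifies to

$$\left( \int_{\mathcal{Y}^N} \left( \frac{\partial \ln q(\boldsymbol{y};\theta)}{\partial\theta} \right)^2 q(\boldsymbol{y};\theta)\, \mathrm{d}\boldsymbol{y} \right)^{-1} = \left( N \int_{\mathcal{Y}} \left( \frac{\partial \ln q(y;\theta)}{\partial\theta} \right)^2 q(y;\theta)\, \mathrm{d}y \right)^{-1}. \tag{14}$$

The left-hand side of (11) is identified as the mean squared error $\mathrm{mse}_Y(\theta)$ of the estimator $\hat{\theta}(\boldsymbol{y})$, such that the Cramér-Rao inequality for unbiased estimators

$$\mathrm{mse}_Y(\theta) = \mathrm{var}_Y(\hat{\theta}) \geq \frac{1}{N F_Y(\theta)} \tag{15}$$

is obtained. Consequently, the Fisher information, defined by

$$F_Y(\theta) = \int_{\mathcal{Y}} \left( \frac{\partial \ln q(y;\theta)}{\partial\theta} \right)^2 q(y;\theta)\, \mathrm{d}y, \tag{16}$$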

is a measure for the amount of intrinsic information about the unknown deterministic parameter $\theta$ contained on average within each observation of the random output $Y$. It can be interpreted as the average contribution of each measurement $y$ to the reduction of the uncertainty $\mathrm{var}_Y(\hat{\theta})$ about the parameter $\theta$. Note that the Fisher information measure also plays an important role for performance bounds in the Bayesian setting, where $\theta$ is considered to be a random variable. A comprehensive overview of this topic, which is beyond the scope of this article, can be found in [16].


B. Relative Inference Capability 

As the inequality (15) holds for all estimation procedures satisfying (1) and asymptotically in $N$ attains equality when the estimator $\hat{\theta}(\boldsymbol{y})$ is efficient, the Fisher information measure (16) can be used to unambiguously assess the relative estimation-theoretic quality of the modification $p(z;\theta)$ with respect to the reference $q(y;\theta)$ by the information ratio

$$\chi(\theta) = \frac{F_Z(\theta)}{F_Y(\theta)}. \tag{17}$$

Note that $F_Z(\theta)$ is the Fisher information (16) evaluated on $Z$ with respect to the conditional probability function $p(z;\theta)$.


C. Fisher Information Bound 

Using $\chi(\theta)$ for the design and optimization of the measurement system requires computing (16) for the benchmark experiment $q(y;\theta)$ and all modifications $p(z;\theta)$ which are of interest. If $p(z;\theta)$ takes a complicated form, this can become difficult. In a situation where the parametric model $p(z;\theta)$ governing the statistics of the output $Z$ is unknown, a direct analytical formulation of the information measure (16) becomes impossible. However, if the first moment

$$\mu_1(\theta) = \int_{\mathcal{Z}} z\, p(z;\theta)\, \mathrm{d}z \tag{18}$$

of the system output $Z$ and the second central output moment

$$\mu_2(\theta) = \int_{\mathcal{Z}} \big(z - \mu_1(\theta)\big)^2 p(z;\theta)\, \mathrm{d}z \tag{19}$$

are known and are both differentiable in $\theta$, it has recently been shown that the Fisher information $F(\theta)$ is in general bounded from below by

$$F_Z(\theta) \geq \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2. \tag{20}$$


While examples can be given where (20) holds with equality, a simple counterexample is immediately constructed. To this end, consider the system output to follow the generic parametric Gaussian distribution

$$p(z;\theta) = \frac{1}{\sqrt{2\pi \mu_2(\theta)}}\, e^{-\frac{(z - \mu_1(\theta))^2}{2 \mu_2(\theta)}}. \tag{21}$$

The exact Fisher information is

$$F_Z(\theta) = \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 + \frac{1}{2 \mu_2^2(\theta)} \left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2 \tag{22}$$

and is equal to (20) only for the special case where

$$\frac{\partial \mu_2(\theta)}{\partial\theta} = 0. \tag{23}$$

Obviously, the inequality (20) does not in general take into account the contribution provided by the variation of the second output moment $\mu_2(\theta)$ to the Fisher information measure $F_Z(\theta)$.
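The gap between the bound (20) and the exact Gaussian value (22) can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: the parametrization $\mu_1(\theta) = \theta$, $\mu_2(\theta) = 1 + \theta^2$ and the function names are assumptions chosen for the example, not part of the original discussion.

```python
def moment_bound(dmu1, mu2):
    """Lower bound (20): uses only the mean derivative and the variance."""
    return dmu1 ** 2 / mu2


def gaussian_fisher(dmu1, dmu2, mu2):
    """Exact Fisher information (22) of the Gaussian model (21)."""
    return dmu1 ** 2 / mu2 + dmu2 ** 2 / (2.0 * mu2 ** 2)


# Hypothetical parametrization (assumption): mu1 = theta, mu2 = 1 + theta^2.
theta = 1.0
mu2 = 1.0 + theta ** 2
dmu1 = 1.0            # d mu1 / d theta
dmu2 = 2.0 * theta    # d mu2 / d theta

lower = moment_bound(dmu1, mu2)
exact = gaussian_fisher(dmu1, dmu2, mu2)
print(lower, exact)
```

Under these assumed values the bound captures only the part of the information carried by the mean; the two expressions coincide once `dmu2` is set to zero, which is the special case (23).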


D. Contribution and Outline 

Motivated by this insight, we aim at a substantial improvement of the lower bound for $F(\theta)$ which we provided in our previous discussion. We achieve this by utilizing the Cauchy-Schwarz inequality (9) under a generalized approach and subsequently maximizing the resulting expression in order to attain an alternative information measure $S(\theta)$. The proposed pessimistic approximation for $F(\theta)$ exclusively contains the first four output moments in parametric form. A discussion of situations like (23) shows that the inequality (20) is contained in the result as one special case. Using various examples with continuous and discrete system outputs, we verify the quality of the alternative information measure $S(\theta)$. In order to demonstrate possible applications of the result and further insights, we use $S(\theta)$ to approximately determine the estimation-theoretic information loss when squaring a standard Gaussian input distribution and advance the discussion about minimum Fisher information. Finally, we mimic a situation of practical relevance. Measuring the output moments of a soft-limiting device with standard Gaussian input, we demonstrate how to conservatively establish the intrinsic inference capability $F(\theta)$ of a non-linear signal processing system when the analytic form of the parametric output statistic $p(z;\theta)$ is not directly available.


II. Improved Fisher Information Bound 

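For the discussion we additionally require the central output moments

$$\mu_3(\theta) = \int_{\mathcal{Z}} \big(z - \mu_1(\theta)\big)^3 p(z;\theta)\, \mathrm{d}z, \qquad \mu_4(\theta) = \int_{\mathcal{Z}} \big(z - \mu_1(\theta)\big)^4 p(z;\theta)\, \mathrm{d}z \tag{24}$$

and their normalized versions

$$\bar{\mu}_3(\theta) = \int_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^3 p(z;\theta)\, \mathrm{d}z = \mu_3(\theta)\, \mu_2^{-\frac{3}{2}}(\theta), \tag{25}$$

$$\bar{\mu}_4(\theta) = \int_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^4 p(z;\theta)\, \mathrm{d}z = \mu_4(\theta)\, \mu_2^{-2}(\theta). \tag{26}$$

Note that $\bar{\mu}_3(\theta)$ is referred to as the skewness, an indicator for the asymmetry of the output distribution $p(z;\theta)$, while $\bar{\mu}_4(\theta)$ is called the kurtosis, a characterization of the shape of the output distribution $p(z;\theta)$. Both moments are related through Pearson's inequality [22]

$$\bar{\mu}_4(\theta) \geq \bar{\mu}_3^2(\theta) + 1. \tag{27}$$

A compact and elegant proof of (27) can be found in [23].

A. Generalized Bounding Approach

We apply the inequality (9) with

$$f(z;\theta) = \frac{\partial \ln p(z;\theta)}{\partial\theta} \tag{28}$$

and

$$g(z;\theta) = \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right) + \beta(\theta) \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^2 - \beta(\theta), \tag{29}$$

where $\beta(\theta) \in \mathbb{R}$, in order to lower bound the Fisher information

$$F(\theta) = \int_{\mathcal{Z}} f^2(z;\theta)\, p(z;\theta)\, \mathrm{d}z. \tag{30}$$

With the manipulations

$$\begin{aligned}
\int_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right) \frac{\partial \ln p(z;\theta)}{\partial\theta}\, p(z;\theta)\, \mathrm{d}z
&= \frac{1}{\sqrt{\mu_2(\theta)}} \left( \int_{\mathcal{Z}} z\, \frac{\partial p(z;\theta)}{\partial\theta}\, \mathrm{d}z - \mu_1(\theta) \int_{\mathcal{Z}} \frac{\partial p(z;\theta)}{\partial\theta}\, \mathrm{d}z \right) \\
&= \frac{1}{\sqrt{\mu_2(\theta)}} \left( \frac{\partial}{\partial\theta} \int_{\mathcal{Z}} z\, p(z;\theta)\, \mathrm{d}z - \mu_1(\theta)\, \frac{\partial}{\partial\theta} \int_{\mathcal{Z}} p(z;\theta)\, \mathrm{d}z \right) \\
&= \frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_1(\theta)}{\partial\theta}
\end{aligned} \tag{31}$$

and

$$\begin{aligned}
\int_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^2 \frac{\partial \ln p(z;\theta)}{\partial\theta}\, p(z;\theta)\, \mathrm{d}z
&= \frac{1}{\mu_2(\theta)} \left( \int_{\mathcal{Z}} z^2\, \frac{\partial p(z;\theta)}{\partial\theta}\, \mathrm{d}z - 2\mu_1(\theta) \int_{\mathcal{Z}} z\, \frac{\partial p(z;\theta)}{\partial\theta}\, \mathrm{d}z + \mu_1^2(\theta) \int_{\mathcal{Z}} \frac{\partial p(z;\theta)}{\partial\theta}\, \mathrm{d}z \right) \\
&= \frac{1}{\mu_2(\theta)} \left( \frac{\partial}{\partial\theta} \int_{\mathcal{Z}} z^2 p(z;\theta)\, \mathrm{d}z - 2\mu_1(\theta)\, \frac{\partial}{\partial\theta} \int_{\mathcal{Z}} z\, p(z;\theta)\, \mathrm{d}z \right) \\
&= \frac{1}{\mu_2(\theta)} \left( \frac{\partial}{\partial\theta} \big( \mu_2(\theta) + \mu_1^2(\theta) \big) - 2\mu_1(\theta)\, \frac{\partial \mu_1(\theta)}{\partial\theta} \right) \\
&= \frac{1}{\mu_2(\theta)}\, \frac{\partial \mu_2(\theta)}{\partial\theta},
\end{aligned} \tag{32}$$

where we use the fact that

$$\int_{\mathcal{Z}} z^2 p(z;\theta)\, \mathrm{d}z = \mu_2(\theta) + \mu_1^2(\theta), \tag{33}$$

the identity

$$\int_{\mathcal{Z}} f(z;\theta)\, g(z;\theta)\, p(z;\theta)\, \mathrm{d}z = \frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_1(\theta)}{\partial\theta} + \frac{\beta(\theta)}{\mu_2(\theta)}\, \frac{\partial \mu_2(\theta)}{\partial\theta} \tag{34}$$

is found. Note that

$$\beta(\theta) \int_{\mathcal{Z}} \frac{\partial \ln p(z;\theta)}{\partial\theta}\, p(z;\theta)\, \mathrm{d}z = \beta(\theta)\, \frac{\partial}{\partial\theta} \int_{\mathcal{Z}} p(z;\theta)\, \mathrm{d}z = 0. \tag{35}$$

Taking into account that

$$\int_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right) p(z;\theta)\, \mathrm{d}z = 0, \tag{36}$$

we get

$$\int_{\mathcal{Z}} g^2(z;\theta)\, p(z;\theta)\, \mathrm{d}z = 1 + 2\beta(\theta)\, \bar{\mu}_3(\theta) + \beta^2(\theta)\, \bar{\mu}_4(\theta) - \beta^2(\theta). \tag{37}$$

Therefore, from (9), (30), (34) and (37) it can be shown that the Fisher information can in general not fall below

$$F(\theta) \geq \frac{\left( \int_{\mathcal{Z}} f(z;\theta)\, g(z;\theta)\, p(z;\theta)\, \mathrm{d}z \right)^2}{\int_{\mathcal{Z}} g^2(z;\theta)\, p(z;\theta)\, \mathrm{d}z} = \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} + \frac{\beta(\theta)}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{1 + 2\beta(\theta)\, \bar{\mu}_3(\theta) + \beta^2(\theta) \big( \bar{\mu}_4(\theta) - 1 \big)}. \tag{38}$$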
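To see how the free factor $\beta(\theta)$ in (38) affects the bound, the expression can be evaluated on a grid of candidate values. The following sketch does this for a Gaussian output ($\bar{\mu}_3 = 0$, $\bar{\mu}_4 = 3$); the parametrization $\mu_1 = \theta$, $\mu_2 = 1 + \theta^2$ at $\theta = 1$, the grid, and the function name `bound_38` are assumptions made for illustration only.

```python
import math


def bound_38(beta, dmu1, dmu2, mu2, skew, kurt):
    """Right-hand side of (38) for a given factor beta."""
    num = (dmu1 + beta * dmu2 / math.sqrt(mu2)) ** 2
    den = 1.0 + 2.0 * beta * skew + beta ** 2 * (kurt - 1.0)
    return num / (mu2 * den)


# Assumed Gaussian example: mu1 = theta, mu2 = 1 + theta^2, evaluated at theta = 1.
args = dict(dmu1=1.0, dmu2=2.0, mu2=2.0, skew=0.0, kurt=3.0)

# Brute-force search over beta on a uniform grid (step 0.001).
betas = [k / 1000.0 for k in range(-3000, 3001)]
best_beta = max(betas, key=lambda b: bound_38(b, **args))
best = bound_38(best_beta, **args)
at_zero = bound_38(0.0, **args)
print(best_beta, best, at_zero)
```

For this assumed example the grid maximum sits near $\beta = 1/\sqrt{2}$ and the bound approaches the exact Gaussian value $1/2 + 1/2 = 1$ given by (22), while $\beta = 0$ yields only $1/2$. A grid search is used here purely to visualize the dependence on $\beta$.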


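B. Optimization of the Information Bound

The expression (38) contains the factor $\beta(\theta)$, which can be used to improve the lower bound. For the trivial choice $\beta(\theta) = 0$, the expression becomes

$$F(\theta) \geq \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2, \tag{39}$$

which turns out to be the bound (20). In order to improve this result, consider that the problem

$$x^\star = \arg\max_{x \in \mathbb{R}} h(x) \tag{40}$$

with

$$h(x) = \frac{(a + x b)^2}{1 + 2xc + x^2 d} \tag{41}$$

has a unique maximizing solution

$$x^\star = \frac{ac - b}{bc - ad}. \tag{42}$$

Consequently, the tightest form of (38) is given by

$$F(\theta) \geq \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} + \frac{\beta^\star(\theta)}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{1 + 2\beta^\star(\theta)\, \bar{\mu}_3(\theta) + \beta^{\star 2}(\theta) \big( \bar{\mu}_4(\theta) - 1 \big)} = S(\theta), \tag{43}$$

with the optimization factor

$$\beta^\star(\theta) = \frac{\frac{\partial \mu_1(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) - \frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta}}{\frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) - \frac{\partial \mu_1(\theta)}{\partial\theta} \big( \bar{\mu}_4(\theta) - 1 \big)}. \tag{44}$$

The inequality (43) states that the derived information measure $S(\theta)$ is always dominated by the Fisher information measure $F(\theta)$. Therefore, $S(\theta)$ gives a cautious approximation for $F(\theta)$. Note that the Fisher information $F(\theta)$ requires integrating the squared score $\left( \frac{\partial \ln p(z;\theta)}{\partial\theta} \right)^2$. In contrast, the alternative measure $S(\theta)$ exclusively requires the first four output moments $\mu_1(\theta)$, $\mu_2(\theta)$, $\bar{\mu}_3(\theta)$, $\bar{\mu}_4(\theta)$ in parametric form.


III. Fisher Information Bound - Special Cases

In order to derive simplified forms of the presented information measure $S(\theta)$, let us consider some special cases.

A. Constant First Moment

For the situation where the first moment $\mu_1(\theta)$ does not vary with the system parameter $\theta$, i.e.,

$$\frac{\partial \mu_1(\theta)}{\partial\theta} = 0, \quad \forall \theta \in \Theta, \tag{45}$$

we attain

$$\beta^\star(\theta) = -\frac{1}{\bar{\mu}_3(\theta)}, \tag{46}$$

such that a pessimistic approximation for $F(\theta)$ is

$$S(\theta) = \frac{1}{\mu_2(\theta)}\, \frac{\left( -\frac{1}{\bar{\mu}_3(\theta) \sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{1 - 2 + \frac{\bar{\mu}_4(\theta) - 1}{\bar{\mu}_3^2(\theta)}} = \frac{1}{\mu_2^2(\theta)}\, \frac{\left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{\bar{\mu}_4(\theta) - \bar{\mu}_3^2(\theta) - 1}. \tag{47}$$

Note that inequality (27) ensures that $S(\theta)$ stays positive under these circumstances.

B. Constant Second Moment

When the second moment $\mu_2(\theta)$ is constant within $\Theta$, i.e.,

$$\frac{\partial \mu_2(\theta)}{\partial\theta} = 0, \quad \forall \theta \in \Theta, \tag{48}$$

it holds that

$$\beta^\star(\theta) = -\frac{\bar{\mu}_3(\theta)}{\bar{\mu}_4(\theta) - 1}. \tag{49}$$

In this situation

$$S(\theta) = \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2}{1 - 2\frac{\bar{\mu}_3^2(\theta)}{\bar{\mu}_4(\theta) - 1} + \frac{\bar{\mu}_3^2(\theta)}{\bar{\mu}_4(\theta) - 1}} = \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2}{1 - \frac{\bar{\mu}_3^2(\theta)}{\bar{\mu}_4(\theta) - 1}}. \tag{50}$$

Note that (50) equals the expression in (20) whenever the skewness $\bar{\mu}_3$ vanishes. In general, the relation (27) makes (50) at least as large as the unoptimized bound (20).

C. Symmetric Distributions

For symmetric output distributions with zero skewness, i.e.,

$$\bar{\mu}_3(\theta) = 0, \tag{51}$$

we verify that the optimization of the information bound derived in (43) results in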


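$$\beta^\star(\theta) = \frac{\frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta}}{\frac{\partial \mu_1(\theta)}{\partial\theta} \big( \bar{\mu}_4(\theta) - 1 \big)}, \tag{52}$$

such that

$$\begin{aligned}
S(\theta) &= \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} + \frac{\left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{\frac{\partial \mu_1(\theta)}{\partial\theta}\, \mu_2(\theta) (\bar{\mu}_4(\theta) - 1)} \right)^2}{1 + \frac{\left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 \mu_2(\theta) (\bar{\mu}_4(\theta) - 1)}} \\
&= \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 + \frac{1}{\mu_2^2(\theta) \big( \bar{\mu}_4(\theta) - 1 \big)} \left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2.
\end{aligned} \tag{53}$$

Again note that according to Pearson's inequality (27)

$$\bar{\mu}_4(\theta) - 1 \geq 0, \tag{54}$$

such that the expression (53) never takes a negative value.

D. Simplifying Characteristic

For the case where the identity

$$\frac{\partial \mu_1(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) \sqrt{\mu_2(\theta)} = \frac{\partial \mu_2(\theta)}{\partial\theta} \tag{55}$$

holds, the optimization of (43) results in

$$\beta^\star(\theta) = 0 \tag{56}$$

and the approximation obtains the compact form (20)

$$S(\theta) = \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2. \tag{57}$$

This situation occurs for example for a symmetric output distribution with constant second moment.


IV. Approximation Quality - Continuous Outputs

In order to demonstrate the quality of the derived lower bound $S(\theta)$, we consider different examples where $F(\theta)$ can be derived in compact form. First we discuss several well-studied distributions with continuous support $\mathcal{Z}$.

A. Gaussian System Output

Consider the system output $Z$ to be the undisturbed observation of a generic Gaussian distribution in parametric form

$$p(z;\theta) = \frac{1}{\sqrt{2\pi \nu_2(\theta)}}\, e^{-\frac{(z - \nu_1(\theta))^2}{2 \nu_2(\theta)}}. \tag{58}$$

The exact Fisher information measure is given by

$$F(\theta) = \frac{1}{\nu_2(\theta)} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} \right)^2 + \frac{1}{2 \nu_2^2(\theta)} \left( \frac{\partial \nu_2(\theta)}{\partial\theta} \right)^2. \tag{59}$$

As for this case the output moments of interest are

$$\mu_1(\theta) = \nu_1(\theta), \quad \mu_2(\theta) = \nu_2(\theta), \quad \bar{\mu}_3(\theta) = 0, \quad \bar{\mu}_4(\theta) = 3, \tag{60}$$

we get the approximation

$$\begin{aligned}
S(\theta) &= \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 + \frac{1}{\mu_2^2(\theta) \big( \bar{\mu}_4(\theta) - 1 \big)} \left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2 \\
&= \frac{1}{\nu_2(\theta)} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} \right)^2 + \frac{1}{2 \nu_2^2(\theta)} \left( \frac{\partial \nu_2(\theta)}{\partial\theta} \right)^2,
\end{aligned} \tag{61}$$

which is obviously a tight lower bound for the original information measure $F(\theta)$.

B. Exponential System Output

As another example we analyze the case where samples from a parametric exponential distribution

$$p(z;\theta) = \nu(\theta)\, e^{-\nu(\theta) z}, \tag{62}$$

with $\nu(\theta) > 0$ and $z \geq 0$, can be collected at the random system output $Z$. The score function under this model is

$$\frac{\partial \ln p(z;\theta)}{\partial\theta} = \frac{1}{\nu(\theta)}\, \frac{\partial \nu(\theta)}{\partial\theta} - z\, \frac{\partial \nu(\theta)}{\partial\theta}, \tag{63}$$

such that the Fisher information is evaluated to be

$$F(\theta) = \int_{\mathcal{Z}} \left( \frac{\partial \ln p(z;\theta)}{\partial\theta} \right)^2 p(z;\theta)\, \mathrm{d}z = \frac{1}{\nu^2(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2. \tag{64}$$

For the approximation $S(\theta)$ the required moments are

$$\mu_1(\theta) = \frac{1}{\nu(\theta)}, \quad \mu_2(\theta) = \frac{1}{\nu^2(\theta)}, \quad \bar{\mu}_3(\theta) = 2, \quad \bar{\mu}_4(\theta) = 9, \tag{65}$$

such that

$$\frac{\partial \mu_1(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) \sqrt{\mu_2(\theta)} = -2 \nu^{-3}(\theta)\, \frac{\partial \nu(\theta)}{\partial\theta} = \frac{\partial \mu_2(\theta)}{\partial\theta}, \tag{66}$$

producing $\beta^\star(\theta) = 0$. The approximation is therefore given by the simplified form

$$S(\theta) = \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 = \nu^2(\theta) \left( -\frac{1}{\nu^2(\theta)}\, \frac{\partial \nu(\theta)}{\partial\theta} \right)^2 = \frac{1}{\nu^2(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2, \tag{67}$$

which obviously matches the true Fisher information $F(\theta)$ in (64) exactly.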
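The two continuous examples above can be re-checked numerically by evaluating (43) with the optimization factor (44). The sketch below is a hedged illustration: the function name `s_measure` and the specific parametrizations ($\nu_1 = \theta$, $\nu_2 = 1 + \theta^2$ for the Gaussian case, $\nu = 2$ with $\partial\nu/\partial\theta = 1$ for the exponential case) are assumptions made for the example.

```python
import math


def s_measure(dmu1, dmu2, mu2, skew, kurt):
    """Evaluate S(theta) from (43) using the optimization factor (44)."""
    a = dmu1
    b = dmu2 / math.sqrt(mu2)
    num = a * skew - b                   # numerator of (44)
    den = b * skew - a * (kurt - 1.0)    # denominator of (44)
    # When the numerator vanishes, identity (55) holds and beta* = 0, see (56).
    beta = 0.0 if num == 0.0 else num / den
    top = (a + beta * b) ** 2
    bottom = 1.0 + 2.0 * beta * skew + beta ** 2 * (kurt - 1.0)
    return top / (mu2 * bottom)


# Gaussian output (58), assumed nu1 = theta, nu2 = 1 + theta^2 at theta = 1.
s_gauss = s_measure(dmu1=1.0, dmu2=2.0, mu2=2.0, skew=0.0, kurt=3.0)
f_gauss = 1.0 / 2.0 + 2.0 ** 2 / (2.0 * 2.0 ** 2)      # exact value from (59)

# Exponential output (62), assumed nu = 2 and d nu / d theta = 1.
nu, dnu = 2.0, 1.0
s_exp = s_measure(dmu1=-dnu / nu ** 2, dmu2=-2.0 * dnu / nu ** 3,
                  mu2=1.0 / nu ** 2, skew=2.0, kurt=9.0)
f_exp = (dnu / nu) ** 2                                # exact value from (64)
print(s_gauss, f_gauss, s_exp, f_exp)
```

In both assumed cases the moment-based value agrees with the exact Fisher information, in line with (61) and (67). The helper does not treat the degenerate situation in which the denominator of (44) vanishes; that case would need separate handling.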
C. Laplacian System Output

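For a third example, we assume that the output $Z$ follows a parametric Laplace distribution with zero mean, i.e.,

$$p(z;\theta) = \frac{1}{2\nu(\theta)}\, e^{-\frac{|z|}{\nu(\theta)}}. \tag{68}$$

The score function is given by

$$\frac{\partial \ln p(z;\theta)}{\partial\theta} = -\frac{1}{\nu(\theta)}\, \frac{\partial \nu(\theta)}{\partial\theta} + \frac{|z|}{\nu^2(\theta)}\, \frac{\partial \nu(\theta)}{\partial\theta} \tag{69}$$

and the exact Fisher information is found to be

$$F(\theta) = \int_{\mathcal{Z}} \left( \frac{\partial \ln p(z;\theta)}{\partial\theta} \right)^2 p(z;\theta)\, \mathrm{d}z = \frac{1}{\nu^2(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2. \tag{70}$$

The first four moments of the output $Z$ are

$$\mu_1(\theta) = 0, \quad \mu_2(\theta) = 2\nu^2(\theta), \quad \bar{\mu}_3(\theta) = 0, \quad \bar{\mu}_4(\theta) = 6. \tag{71}$$

As the first moment is constant with respect to the system parameter $\theta$, the approximation takes the form

$$S(\theta) = \frac{1}{\mu_2^2(\theta)}\, \frac{\left( \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{\bar{\mu}_4(\theta) - 1} = \frac{1}{4\nu^4(\theta)}\, \frac{\left( 4\nu(\theta)\, \frac{\partial \nu(\theta)}{\partial\theta} \right)^2}{5} = \frac{4}{5}\, \frac{1}{\nu^2(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2. \tag{72}$$

In contrast to the other examples, the information bound $S(\theta)$ is not tight under the Laplacian system model. However, $S(\theta)$ still provides a pessimistic characterization of the Fisher information measure $F(\theta)$.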
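The factor $4/5$ in (72) can be reproduced without using the closed-form moments (71), which hints at how the bound is applied when only the density is available numerically. The sketch below integrates the Laplace density (68) on a grid with the trapezoidal rule and differentiates $\mu_2$ by a central difference; the choice $\nu(\theta) = \theta$, the grid limits and the step sizes are assumptions made for this illustration.

```python
import numpy as np


def laplace_moments(nu):
    """Numerically integrate the moments of the Laplace density (68)."""
    z = np.linspace(-60.0, 60.0, 240001)      # grid contains z = 0 (kink of |z|)
    p = np.exp(-np.abs(z) / nu) / (2.0 * nu)

    def integrate(f):
        # Plain trapezoidal rule on the grid.
        return float(np.sum((f[1:] + f[:-1]) * np.diff(z)) / 2.0)

    mu1 = integrate(z * p)
    mu2 = integrate((z - mu1) ** 2 * p)
    skew = integrate((z - mu1) ** 3 * p) / mu2 ** 1.5
    kurt = integrate((z - mu1) ** 4 * p) / mu2 ** 2
    return mu1, mu2, skew, kurt


theta, h = 1.5, 1e-3                           # assumed model nu(theta) = theta
_, mu2, skew, kurt = laplace_moments(theta)
dmu2 = (laplace_moments(theta + h)[1] - laplace_moments(theta - h)[1]) / (2.0 * h)

s_value = dmu2 ** 2 / (mu2 ** 2 * (kurt - skew ** 2 - 1.0))   # special case (47)
f_value = 1.0 / theta ** 2                                    # exact value (70)
ratio = s_value / f_value
print(ratio)
```

With these assumed settings the ratio $S(\theta)/F(\theta)$ comes out close to $0.8$, matching (72) up to discretization error.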


V. Approximation Quality - Discrete Outputs 

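In the following we extend the discussion on the bounding quality of $S(\theta)$ to the case where the system output $Z$ takes values from a discrete alphabet $\mathcal{Z}$.


A. Bernoulli System Output

As a first example for this kind of system output, observations from a parametric Bernoulli distribution with

$$p(z = 1;\theta) = 1 - p(z = 0;\theta) = \nu(\theta) \tag{73}$$

are considered, where

$$0 < \nu(\theta) < 1, \quad \forall \theta \in \Theta. \tag{74}$$

The Fisher information measure under this model is

$$\begin{aligned}
F(\theta) &= \int_{\mathcal{Z}} \left( \frac{\partial \ln p(z;\theta)}{\partial\theta} \right)^2 p(z;\theta)\, \mathrm{d}z = \sum_{\mathcal{Z}} \frac{\left( \frac{\partial p(z;\theta)}{\partial\theta} \right)^2}{p(z;\theta)} \\
&= \frac{\left( \frac{\partial p(z=1;\theta)}{\partial\theta} \right)^2}{p(z = 1;\theta)} + \frac{\left( \frac{\partial p(z=0;\theta)}{\partial\theta} \right)^2}{p(z = 0;\theta)} = \frac{1}{\nu(\theta) \big( 1 - \nu(\theta) \big)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2.
\end{aligned} \tag{75}$$

The first two moments are

$$\mu_1(\theta) = \nu(\theta), \qquad \mu_2(\theta) = \nu(\theta) \big( 1 - \nu(\theta) \big), \tag{76}$$

with derivatives

$$\frac{\partial \mu_1(\theta)}{\partial\theta} = \frac{\partial \nu(\theta)}{\partial\theta}, \qquad \frac{\partial \mu_2(\theta)}{\partial\theta} = \big( 1 - 2\nu(\theta) \big) \frac{\partial \nu(\theta)}{\partial\theta}. \tag{77}$$

The third normalized moment is

$$\begin{aligned}
\bar{\mu}_3(\theta) &= \sum_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^3 p(z;\theta) \\
&= \left( \frac{1 - \nu(\theta)}{\sqrt{\nu(\theta)(1 - \nu(\theta))}} \right)^3 \nu(\theta) + \left( \frac{-\nu(\theta)}{\sqrt{\nu(\theta)(1 - \nu(\theta))}} \right)^3 \big( 1 - \nu(\theta) \big) \\
&= \frac{1 - 2\nu(\theta)}{\sqrt{\nu(\theta)(1 - \nu(\theta))}}
\end{aligned} \tag{78}$$

and the fourth normalized moment

$$\begin{aligned}
\bar{\mu}_4(\theta) &= \sum_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^4 p(z;\theta) \\
&= \left( \frac{1 - \nu(\theta)}{\sqrt{\nu(\theta)(1 - \nu(\theta))}} \right)^4 \nu(\theta) + \left( \frac{-\nu(\theta)}{\sqrt{\nu(\theta)(1 - \nu(\theta))}} \right)^4 \big( 1 - \nu(\theta) \big) \\
&= \frac{1}{\nu(\theta)(1 - \nu(\theta))} - 3.
\end{aligned} \tag{79}$$

As

$$\frac{\partial \mu_1(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) \sqrt{\mu_2(\theta)} = \big( 1 - 2\nu(\theta) \big) \frac{\partial \nu(\theta)}{\partial\theta} = \frac{\partial \mu_2(\theta)}{\partial\theta} \tag{80}$$

and consequently $\beta^\star(\theta) = 0$, the approximation takes its simplified form

$$S(\theta) = \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 = \frac{1}{\nu(\theta)(1 - \nu(\theta))} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2. \tag{81}$$

It becomes clear that also for a binary system output $Z$, following a parametric Bernoulli distribution, the derived expression $S(\theta)$ is a tight approximation for the original inference capability $F(\theta)$.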
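The Bernoulli result can be checked directly from a probability mass function by computing the moments as finite sums and the derivatives by central differences. The sketch below verifies the identity (80) and compares the compact form (81) with the exact value (75); the logistic choice for $\nu(\theta)$, the evaluation point and the helper names are assumptions made for this illustration.

```python
import math


def bernoulli_pmf(theta):
    """Assumed model: nu(theta) is a logistic function of theta."""
    nu = 1.0 / (1.0 + math.exp(-theta))
    return {1: nu, 0: 1.0 - nu}


def moments(pmf):
    """Mean, variance, skewness and kurtosis of a finite pmf."""
    mu1 = sum(z * p for z, p in pmf.items())
    mu2 = sum((z - mu1) ** 2 * p for z, p in pmf.items())
    skew = sum(((z - mu1) / math.sqrt(mu2)) ** 3 * p for z, p in pmf.items())
    kurt = sum(((z - mu1) / math.sqrt(mu2)) ** 4 * p for z, p in pmf.items())
    return mu1, mu2, skew, kurt


theta, h = 0.3, 1e-5
mu1, mu2, skew, kurt = moments(bernoulli_pmf(theta))
up, dn = moments(bernoulli_pmf(theta + h)), moments(bernoulli_pmf(theta - h))
dmu1 = (up[0] - dn[0]) / (2.0 * h)
dmu2 = (up[1] - dn[1]) / (2.0 * h)

residual_80 = dmu1 * skew * math.sqrt(mu2) - dmu2   # identity (80), should vanish
s_value = dmu1 ** 2 / mu2                           # compact form (81)

# Exact Fisher information (75): sum over z of (dp/dtheta)^2 / p.
p_up, p_dn, p0 = bernoulli_pmf(theta + h), bernoulli_pmf(theta - h), bernoulli_pmf(theta)
f_value = sum(((p_up[z] - p_dn[z]) / (2.0 * h)) ** 2 / p0[z] for z in p0)
print(residual_80, s_value, f_value)
```

The compact form is used here instead of evaluating (44) numerically, since the identity (80) makes the numerator of (44) vanish and a floating-point ratio would not be meaningful in that situation.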

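B. Poissonian System Output

As a second example with discrete output, we consider the Poisson distribution. The samples $z$ at the output $Z$ are distributed according to the model

$$p(z;\theta) = \frac{\nu^z(\theta)}{z!}\, e^{-\nu(\theta)} \tag{82}$$

with

$$\mathcal{Z} = \{0, 1, 2, \ldots\} \tag{83}$$

and

$$\nu(\theta) > 0, \quad \forall \theta \in \Theta. \tag{84}$$

The second derivative of the log-likelihood is given by

$$\frac{\partial^2 \ln p(z;\theta)}{\partial\theta^2} = z \left( \frac{\frac{\partial^2 \nu(\theta)}{\partial\theta^2}\, \nu(\theta) - \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2}{\nu^2(\theta)} \right) - \frac{\partial^2 \nu(\theta)}{\partial\theta^2}. \tag{85}$$

With the mean of the system output being

$$\mathrm{E}[Z] = \nu(\theta), \tag{86}$$

we calculate

$$F(\theta) = \int_{\mathcal{Z}} \left( \frac{\partial \ln p(z;\theta)}{\partial\theta} \right)^2 p(z;\theta)\, \mathrm{d}z = -\int_{\mathcal{Z}} \frac{\partial^2 \ln p(z;\theta)}{\partial\theta^2}\, p(z;\theta)\, \mathrm{d}z = \frac{1}{\nu(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2. \tag{87}$$

In order to apply the approximation $S(\theta)$, we require the moments, which are given by

$$\mu_1(\theta) = \nu(\theta), \quad \mu_2(\theta) = \nu(\theta), \quad \bar{\mu}_3(\theta) = \frac{1}{\sqrt{\nu(\theta)}}, \quad \bar{\mu}_4(\theta) = \frac{1}{\nu(\theta)} + 3. \tag{88}$$

As these quantities exhibit the property

$$\frac{\partial \mu_1(\theta)}{\partial\theta} \sqrt{\mu_2(\theta)}\, \bar{\mu}_3(\theta) = \frac{\partial \nu(\theta)}{\partial\theta} = \frac{\partial \mu_2(\theta)}{\partial\theta}, \tag{89}$$

we obtain $\beta^\star(\theta) = 0$ and the approximation for this example

$$S(\theta) = \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 = \frac{1}{\nu(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2 \tag{90}$$

is tight with respect to $F(\theta)$.

C. Hard-limited Gaussian System Output

As a last discrete example, we consider the output $Z$ of a hard-limiting device [24], i.e.,

$$Z = \operatorname{sign}(Y), \tag{91}$$

where the generalized signum operator is defined by

$$\operatorname{sign}(x) = \begin{cases} +1 & \text{if } x \geq \gamma \\ -1 & \text{if } x < \gamma. \end{cases} \tag{92}$$

As input $Y$ to the hard-limiter, a generic parametric Gaussian distribution

$$p(y;\theta) = \frac{1}{\sqrt{2\pi \nu_2(\theta)}}\, e^{-\frac{(y - \nu_1(\theta))^2}{2 \nu_2(\theta)}} \tag{93}$$

is used. The conditional probability mass function of the binary output $Z$ in this experiment is

$$p(z = 1;\theta) = Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big), \tag{94}$$

$$p(z = -1;\theta) = 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big), \tag{95}$$

with

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-\frac{t^2}{2}}\, \mathrm{d}t \tag{96}$$

being the Q-function. Note that the derivative of the Q-function is given by

$$\frac{\partial Q(x)}{\partial x} = -\frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}. \tag{97}$$

The corresponding derivatives of the conditional probability mass function in this example take the form

$$\frac{\partial p(z = 1;\theta)}{\partial\theta} = \frac{e^{-\frac{(\gamma - \nu_1(\theta))^2}{2 \nu_2(\theta)}}}{\sqrt{2\pi \nu_2(\theta)}} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} + \frac{\gamma - \nu_1(\theta)}{2 \nu_2(\theta)}\, \frac{\partial \nu_2(\theta)}{\partial\theta} \right), \tag{98}$$

$$\frac{\partial p(z = -1;\theta)}{\partial\theta} = -\frac{\partial p(z = 1;\theta)}{\partial\theta}. \tag{99}$$

Thus, the exact Fisher information $F(\theta)$ is found to be

$$F(\theta) = \frac{e^{-\frac{(\gamma - \nu_1(\theta))^2}{\nu_2(\theta)}} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} + \frac{\gamma - \nu_1(\theta)}{2 \nu_2(\theta)}\, \frac{\partial \nu_2(\theta)}{\partial\theta} \right)^2}{2\pi \nu_2(\theta)\, Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big)}. \tag{100}$$

For the approximation $S(\theta)$, we calculate the first two output moments by

$$\mu_1(\theta) = p(z = 1;\theta) - p(z = -1;\theta) = 2 Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) - 1 \tag{101}$$

and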


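$$\begin{aligned}
\mu_2(\theta) &= \sum_{\mathcal{Z}} \big( z - \mu_1(\theta) \big)^2 p(z;\theta) \\
&= 4 \Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big)^2 Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) + 4 \Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big) Q^2\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \\
&= 4 \Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big) Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big).
\end{aligned} \tag{102}$$

The third and fourth moment in normalized form are given by

$$\bar{\mu}_3(\theta) = \sum_{\mathcal{Z}} \left( \frac{z - \mu_1(\theta)}{\sqrt{\mu_2(\theta)}} \right)^3 p(z;\theta) = \frac{1 - 2 Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big)}{\sqrt{\Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big) Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big)}} \tag{103}$$

and

$$\bar{\mu}_4(\theta) = \frac{1}{\Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big) Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big)} - 3. \tag{104}$$

With the derivatives

$$\frac{\partial \mu_1(\theta)}{\partial\theta} = \frac{2 e^{-\frac{(\gamma - \nu_1(\theta))^2}{2 \nu_2(\theta)}}}{\sqrt{2\pi \nu_2(\theta)}} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} + \frac{\gamma - \nu_1(\theta)}{2 \nu_2(\theta)}\, \frac{\partial \nu_2(\theta)}{\partial\theta} \right) \tag{105}$$

and

$$\frac{\partial \mu_2(\theta)}{\partial\theta} = \frac{4 e^{-\frac{(\gamma - \nu_1(\theta))^2}{2 \nu_2(\theta)}}}{\sqrt{2\pi \nu_2(\theta)}} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} + \frac{\gamma - \nu_1(\theta)}{2 \nu_2(\theta)}\, \frac{\partial \nu_2(\theta)}{\partial\theta} \right) \Big( 1 - 2 Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big), \tag{106}$$

we verify that

$$\frac{\partial \mu_1(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) \sqrt{\mu_2(\theta)} = \frac{4 e^{-\frac{(\gamma - \nu_1(\theta))^2}{2 \nu_2(\theta)}}}{\sqrt{2\pi \nu_2(\theta)}} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} + \frac{\gamma - \nu_1(\theta)}{2 \nu_2(\theta)}\, \frac{\partial \nu_2(\theta)}{\partial\theta} \right) \Big( 1 - 2 Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big) = \frac{\partial \mu_2(\theta)}{\partial\theta}. \tag{107}$$

Therefore, the information bound is given by

$$S(\theta) = \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 = \frac{e^{-\frac{(\gamma - \nu_1(\theta))^2}{\nu_2(\theta)}} \left( \frac{\partial \nu_1(\theta)}{\partial\theta} + \frac{\gamma - \nu_1(\theta)}{2 \nu_2(\theta)}\, \frac{\partial \nu_2(\theta)}{\partial\theta} \right)^2}{2\pi \nu_2(\theta)\, Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big( 1 - Q\Big( \tfrac{\gamma - \nu_1(\theta)}{\sqrt{\nu_2(\theta)}} \Big) \Big)}. \tag{108}$$

Comparing this with the expression (100) for the exact information measure $F(\theta)$, it can be concluded that also for a generic hard-limited Gaussian distribution the information bound $S(\theta)$ is a pessimistic approximation for the Fisher information $F(\theta)$ with extraordinary quality.


VI. Applications

Finally, we want to outline possible applications of the presented approach and the opportunities provided by an information bound like $S(\theta)$. To this end, we present three problems for which $S(\theta)$ provides interesting and useful insights. The discussed problems cover theoretical as well as practical aspects of statistical signal processing.

A. Worst-Case Noise and Minimum Fisher Information

An important question in signal processing is to specify the worst-case noise distribution under the considered system model [25]. A common assumption in the field is that noise affects technical receive systems in an additive way. Therefore, a model of high practical relevance is

$$Z = x(\theta) + W, \tag{109}$$

where $x(\theta)$ is a deterministic pilot signal modulated by the unknown parameter $\theta$ (for example attenuation, time-delay, frequency-offset, etc.) and $W$ is additive independent random noise. Without loss of generality it can be assumed that the noise is zero-mean, i.e.,

$$\mathrm{E}[W] = 0. \tag{110}$$

If in addition the noise has the property

$$\mathrm{E}[W^2] = \nu, \tag{111}$$

i.e., the second central moment of $Z$ is constant, it is well understood that assuming the noise component $W$ to follow the Gaussian probability density function

$$p(w) = \frac{1}{\sqrt{2\pi\nu}}\, e^{-\frac{w^2}{2\nu}} \tag{112}$$

leads to minimum Fisher information $F(\theta)$ [26]. Therefore, from an estimation-theoretic perspective, Gaussian noise is the worst-case assumption in an additive system like (109) with constant second output moment. The presented bounding approach $S(\theta)$ allows slightly stronger statements. If for any system $p(z;\theta)$ (including non-additive systems) the output $Z$ exhibits the characteristic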


$$\mu_1(\theta) = \mathrm{E}[Z] = x(\theta), \tag{113}$$

$$\mu_2(\theta) = \mathrm{E}\big[ (Z - \mu_1(\theta))^2 \big] = \nu, \tag{114}$$

the presented result shows that $F(\theta)$ cannot violate

$$F(\theta) \geq \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2}{1 - \frac{\bar{\mu}_3^2(\theta)}{\bar{\mu}_4(\theta) - 1}}. \tag{115}$$

This lower bound is minimized by a symmetric distribution, i.e., $\bar{\mu}_3(\theta) = 0$. The resulting expression

$$F(\theta) \geq \frac{1}{\mu_2(\theta)} \left( \frac{\partial \mu_1(\theta)}{\partial\theta} \right)^2 \tag{116}$$

reaches equality under an additive Gaussian system model

$$p(z;\theta) = \frac{1}{\sqrt{2\pi\nu}}\, e^{-\frac{(z - x(\theta))^2}{2\nu}}, \tag{117}$$

such that the worst-case model assumption with respect to Fisher information under the considered restrictions (113) and (114) is in general additive and Gaussian. In the more general setting, where also the second output moment exhibits a dependency on the system parameter $\theta$,

$$\mu_1(\theta) = \mathrm{E}[Z] = x(\theta), \tag{118}$$

$$\mu_2(\theta) = \mathrm{E}\big[ (Z - \mu_1(\theta))^2 \big] = \nu(\theta), \tag{119}$$

and additionally the output distribution is symmetric, i.e.,

$$\bar{\mu}_3(\theta) = 0, \tag{120}$$

the presented result allows us to conclude that the Fisher information is in general bounded from below by

$$F(\theta) \geq \frac{1}{\nu(\theta)} \left( \frac{\partial x(\theta)}{\partial\theta} \right)^2 + \frac{1}{\nu^2(\theta) \big( \bar{\mu}_4(\theta) - 1 \big)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2. \tag{121}$$

As the system model

$$p(z;\theta) = \frac{1}{\sqrt{2\pi \nu(\theta)}}\, e^{-\frac{(z - x(\theta))^2}{2 \nu(\theta)}} \tag{122}$$

exhibits the inference capability

$$F(\theta) = \frac{1}{\nu(\theta)} \left( \frac{\partial x(\theta)}{\partial\theta} \right)^2 + \frac{1}{2 \nu^2(\theta)} \left( \frac{\partial \nu(\theta)}{\partial\theta} \right)^2, \tag{123}$$

it can be concluded together with (27) that for all cases where

$$1 \leq \bar{\mu}_4(\theta) \leq 3, \tag{124}$$

the worst-case system model $p(z;\theta)$ with respect to parameter estimation is the parametric Gaussian one (122).


B. Information Loss - Squaring Device

Another interesting problem in statistical signal processing is to characterize the estimation-theoretic quality of non-linear receive and measurement systems. The Fisher information measure $F(\theta)$ is a rigorous tool which allows precise conclusions to be drawn. However, depending on the nature of the non-linearity, the exact calculation of the information measure $F(\theta)$ can become complicated. As an example of such a scenario, consider the problem of analyzing the intrinsic capability of a system with a squaring sensor output

$$Z = Y^2 \tag{125}$$

to infer the mean $\theta$ of a Gaussian input

$$p(y;\theta) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(y - \theta)^2}{2}} \tag{126}$$

with unit variance. In such a case the system output $Z$ follows a non-central chi-square distribution parameterized by $\theta$. As the analytical description of the associated probability density function $p(z;\theta)$ includes a Bessel function, the characterization of the Fisher information $F(\theta)$ in compact analytical form is non-trivial. We short-cut the derivation by using the presented approximation $S(\theta)$ instead of $F(\theta)$. The first two output moments are found to be given by

$$\mathrm{E}[Z] = \mathrm{E}\big[ \theta^2 + 2\theta W + W^2 \big] = \theta^2 + 1 = \mu_1(\theta), \tag{127}$$

$$\mathrm{E}\big[ (Z - \mu_1(\theta))^2 \big] = \mathrm{E}\big[ (\theta^2 + 2\theta W + W^2 - \theta^2 - 1)^2 \big] = 2(2\theta^2 + 1) = \mu_2(\theta), \tag{128}$$

where we have introduced the auxiliary random variable

$$W = Y - \theta. \tag{129}$$

The third output moment is

$$\mathrm{E}\big[ (Z - \mu_1(\theta))^3 \big] = \mathrm{E}\big[ (\theta^2 + 2\theta W + W^2 - \theta^2 - 1)^3 \big] = 8(3\theta^2 + 1) = \mu_3(\theta), \tag{130}$$

while the fourth moment is

$$\mathrm{E}\big[ (Z - \mu_1(\theta))^4 \big] = \mathrm{E}\big[ (\theta^2 + 2\theta W + W^2 - \theta^2 - 1)^4 \big] = 12\big( (2\theta^2 + 1)^2 + 4(4\theta^2 + 1) \big) = \mu_4(\theta). \tag{131}$$

The normalized versions of the third and fourth moment are

$$\bar{\mu}_3(\theta) = \mu_3(\theta)\, \mu_2^{-\frac{3}{2}}(\theta) = \frac{8(3\theta^2 + 1)}{2^{\frac{3}{2}} (2\theta^2 + 1)^{\frac{3}{2}}} = \frac{2\sqrt{2}\,(3\theta^2 + 1)}{(2\theta^2 + 1)^{\frac{3}{2}}} \tag{132}$$

and


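$$\bar{\mu}_4(\theta) = \mu_4(\theta)\, \mu_2^{-2}(\theta) = \frac{12\big( (2\theta^2 + 1)^2 + 4(4\theta^2 + 1) \big)}{4 (2\theta^2 + 1)^2} = \frac{12(4\theta^2 + 1)}{(2\theta^2 + 1)^2} + 3. \tag{133}$$

With the derivatives

$$\frac{\partial \mu_1(\theta)}{\partial\theta} = 2\theta, \qquad \frac{\partial \mu_2(\theta)}{\partial\theta} = 8\theta, \tag{134}$$

we obtain

$$\beta^\star(\theta) = \frac{\frac{\partial \mu_1(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) - \frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta}}{\frac{1}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta}\, \bar{\mu}_3(\theta) - \frac{\partial \mu_1(\theta)}{\partial\theta} \big( \bar{\mu}_4(\theta) - 1 \big)} = -\frac{\theta^2 \sqrt{2(2\theta^2 + 1)}}{4\theta^4 + 16\theta^2 + 3} \tag{135}$$

and the approximation is finally given by

$$\begin{aligned}
S(\theta) &= \frac{1}{\mu_2(\theta)}\, \frac{\left( \frac{\partial \mu_1(\theta)}{\partial\theta} + \frac{\beta^\star(\theta)}{\sqrt{\mu_2(\theta)}}\, \frac{\partial \mu_2(\theta)}{\partial\theta} \right)^2}{1 + 2\beta^\star(\theta)\, \bar{\mu}_3(\theta) + \beta^{\star 2}(\theta) \big( \bar{\mu}_4(\theta) - 1 \big)} \\
&= \frac{2\theta^2 (4\theta^4 + 12\theta^2 + 3)^2}{(4\theta^4 + 12\theta^2 + 3)(8\theta^6 + 24\theta^4 + 18\theta^2 + 3)} \\
&= \frac{2\theta^2 (4\theta^4 + 12\theta^2 + 3)}{8\theta^6 + 24\theta^4 + 18\theta^2 + 3}.
\end{aligned} \tag{136}$$

Fig. 1 depicts the approximative information loss

$$\chi(\theta) = \frac{S_Z(\theta)}{F_Y(\theta)} \tag{137}$$

when squaring the random input variable $Y$. As a comparison, the corresponding loss for a symmetric hard-limiter (91) with $\gamma = 0$ is also visualized.

Fig. 1. Non-linear Systems - Performance Loss

It can be observed that for low values of $\theta$ the information about the sign (hard-limiting) of the system input $Y$ conveys much more information about the input mean $\theta$ than the amplitude (squaring). For $\theta > 0.75$ the situation changes and the squaring receiver outperforms the hard-limiter when it comes to estimating the mean $\theta$ of the input $Y$ from samples of the system output $Z$.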
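The comparison between the two non-linear devices can be retraced from the closed-form expressions alone. The sketch below evaluates (136) for the squaring device and (100) for the hard-limiter with $\gamma = 0$, $\nu_1(\theta) = \theta$ and $\nu_2 = 1$, and locates the crossover by bisection. Since $F_Y(\theta) = 1$ for the unit-variance Gaussian input, both values can be read as information ratios (17). The function names, the bracketing interval and the number of bisection steps are assumptions made for this illustration.

```python
import math


def q_function(x):
    """Gaussian tail probability (96), expressed through math.erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def s_squaring(theta):
    """Closed-form approximation (136) for the squaring device."""
    t2 = theta ** 2
    num = 2.0 * t2 * (4.0 * t2 ** 2 + 12.0 * t2 + 3.0)
    den = 8.0 * t2 ** 3 + 24.0 * t2 ** 2 + 18.0 * t2 + 3.0
    return num / den


def f_hard_limiter(theta):
    """Exact Fisher information (100) with gamma = 0, nu1 = theta, nu2 = 1."""
    q = q_function(-theta)
    return math.exp(-theta ** 2) / (2.0 * math.pi * q * (1.0 - q))


# Bisection on the difference, assuming a single sign change in [0.5, 1.0].
lo, hi = 0.5, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if s_squaring(mid) < f_hard_limiter(mid):
        lo = mid
    else:
        hi = mid
crossover = 0.5 * (lo + hi)
print(crossover, s_squaring(crossover), f_hard_limiter(crossover))
```

At $\theta = 0$ the hard-limiter value reduces to $2/\pi$, while the squaring device carries no information about the sign of the mean. The crossover located by the sketch should be read against the approximate threshold quoted in the text; keep in mind that the squaring curve is the pessimistic measure $S(\theta)$, not the exact Fisher information.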

C. Measuring Inference Capability - Soft-Limiter 

A situation that can be encountered in practice is that the analytical characterization of the system model $p(z;\theta)$ or its moments is difficult. If the appropriate parametric system model $p(z;\theta)$ is unknown, the direct use of an analytical tool like the Fisher information measure $F(\theta)$ becomes impossible. However, in such a situation the information bound $S(\theta)$ makes it possible to numerically approximate the Fisher information measure $F(\theta)$ at low complexity. To this end, the moments of the system output $Z$ are measured in a calibrated setup, where the parameter $\theta$ can be controlled, or determined by Monte-Carlo simulations. We demonstrate this validation technique by using a soft-limiter model, i.e., the system input $Y$ is transformed according to

[equation (138), the soft-limiter characteristic, is not legible in the source]

where $\zeta$ is a constant model parameter and

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, \mathrm{d}t \tag{139}$$

is the error function. This non-linear model can for example be used in order to characterize saturation effects in analog system components like low-noise amplifiers. In Fig. 2 the input-to-output mapping of the model (138) is depicted for different setups $\zeta$.

Fig. 2. Soft Limiter Model - Input-to-Output

As input we consider a Gaussian distribution with unit variance as in (126). The output moments $\mu_1(\theta)$, $\mu_2(\theta)$, $\bar{\mu}_3(\theta)$, $\bar{\mu}_4(\theta)$ are measured by simulating the non-linear system output $Z$ with $10^9$ independent realizations for each considered value of the input mean $\theta$. The result is shown in Fig. 3.

Fig. 3. Soft-Limiter Model - Measured Moments ($\zeta = 0.5$)

After numerically approximating the required derivatives $\frac{\partial \mu_1(\theta)}{\partial\theta}$ and $\frac{\partial \mu_2(\theta)}{\partial\theta}$, which are depicted in Fig. 4, the approximation $S(\theta)$ is calculated.

Fig. 4. Soft-Limiter Model - Measured Derivatives ($\zeta = 0.5$)

In Fig. 5 the measured information loss $\chi(\theta)$ of the soft-limiter model is shown, where the dotted line indicates the exact information loss $\chi(\theta)$ with a hard-limiter (91), which is equivalent to a soft-limiter with $\zeta \to 0$.
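The measurement procedure described above can be sketched as a small Monte-Carlo routine: simulate the output, estimate the four moments, approximate the two derivatives by central differences, and evaluate (43) with (44). Because the soft-limiter equation (138) is not reproduced here, the sketch is validated on the squaring device (125), for which the closed form (136) is available. The function name `measured_s`, the sample size, the step width, the seed and the reuse of the same noise samples for all three parameter values are assumptions of this illustration and not the authors' setup.

```python
import numpy as np


def measured_s(nonlinearity, theta, n=400_000, h=1e-2, seed=0):
    """Estimate S(theta) from simulated outputs of `nonlinearity`."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)          # unit-variance Gaussian noise, reused below

    def moments(t):
        z = nonlinearity(t + w)         # system output for input mean t
        mu1 = z.mean()
        c = z - mu1
        mu2 = np.mean(c ** 2)
        return mu1, mu2, np.mean(c ** 3) / mu2 ** 1.5, np.mean(c ** 4) / mu2 ** 2

    mu1, mu2, skew, kurt = moments(theta)
    up, dn = moments(theta + h), moments(theta - h)
    a = (up[0] - dn[0]) / (2.0 * h)                  # d mu1 / d theta
    b = (up[1] - dn[1]) / (2.0 * h) / np.sqrt(mu2)   # (d mu2 / d theta) / sqrt(mu2)
    beta = (a * skew - b) / (b * skew - a * (kurt - 1.0))   # factor (44)
    return float((a + beta * b) ** 2
                 / (mu2 * (1.0 + 2.0 * beta * skew + beta ** 2 * (kurt - 1.0))))


theta = 1.0
estimate = measured_s(np.square, theta)
t2 = theta ** 2
closed_form = (2.0 * t2 * (4.0 * t2 ** 2 + 12.0 * t2 + 3.0)
               / (8.0 * t2 ** 3 + 24.0 * t2 ** 2 + 18.0 * t2 + 3.0))   # (136)
print(estimate, closed_form)
```

Any other memoryless characteristic, such as a soft limiter, could be passed as the `nonlinearity` callable. Reusing the same noise realizations at $\theta - h$, $\theta$ and $\theta + h$ is a variance-reduction choice made here so that the finite differences are not dominated by sampling noise.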


VII. Conclusion 

We have established a strong and generic lower bound for the Fisher information measure. Through various examples we have shown that the derived expression has the potential to provide a good approximation in a broad range of cases. This makes the presented information bound a versatile mathematical tool for a variety of problems encountered in the design and optimization of signal processing systems. Further, the pessimistic nature of the attained alternative information measure allows us to strengthen insights on worst-case noise and to generalize classical results on Gaussian system models which exhibit minimum Fisher information. Finally, we have outlined how to use the presented information bound in order to benchmark physical measurement systems with output statistics of unknown analytical form.

Fig. 5. Soft-Limiter Model - Information Loss